Genetic Algorithms in Partitional Clustering: A Comparison
Abstract
Three approaches to partitional clustering using genetic algorithms (GAs) are compared with k-means and the EM algorithm on three real-world datasets (Iris, Glass and Vowel). The GA techniques differ in how they encode the clustering problem: a class id for each object (GAIE), medoids, with each object assigned to the class of its nearest medoid (GAME), or parameters of the multivariate distributions that describe each cluster (GAPE). For the simple Iris dataset, all algorithms except GAIE obtained results of comparable accuracy, but k-means and EM had more runs with inferior results than GAME and GAPE. For the more complex Glass dataset, GAME and GAPE were superior to k-means, EM and GAIE in both accuracy and the variance of results over repeated runs. None of the algorithms was superior for the most complex dataset (Vowel). We conclude that GAs are a valuable alternative to k-means and EM in clustering, but that the choice of problem representation is crucial.
Keywords: partitional cluster analysis, genetic algorithms, optimisation in clustering, k-means, Expectation Maximisation (EM).
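To make the difference between the encodings concrete, the following minimal sketch shows how a chromosome could be decoded into a partition under a GAIE-style and a GAME-style representation. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names, the toy data, the Euclidean distance and the sum-of-squared-errors fitness are all hypothetical choices, since the abstract does not specify them. The GAPE encoding (distribution parameters) is omitted for brevity.

```python
import math

def decode_class_ids(chromosome):
    """GAIE-style decoding (assumed): gene i is the cluster label of object i."""
    return list(chromosome)

def decode_medoids(chromosome, data):
    """GAME-style decoding (assumed): genes are indices of medoid objects,
    and every object is assigned to the cluster of its nearest medoid."""
    labels = []
    for point in data:
        distances = [math.dist(point, data[m]) for m in chromosome]
        labels.append(distances.index(min(distances)))
    return labels

def within_cluster_sse(labels, data):
    """Hypothetical fitness: sum of squared Euclidean distances from each
    object to the centroid of its cluster (lower is better)."""
    clusters = {}
    for label, point in zip(labels, data):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for points in clusters.values():
        centroid = [sum(col) / len(points) for col in zip(*points)]
        total += sum(math.dist(p, centroid) ** 2 for p in points)
    return total

# Toy data: two well-separated pairs of 2-D points (illustrative only).
data = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]

gaie_labels = decode_class_ids([0, 0, 1, 1])  # one gene per object
game_labels = decode_medoids([0, 2], data)    # one gene per cluster

print(gaie_labels, within_cluster_sse(gaie_labels, data))
print(game_labels, within_cluster_sse(game_labels, data))
```

Note the difference in chromosome length: the class-id encoding needs one gene per object, whereas the medoid encoding needs only one gene per cluster, so the search space it spans is much smaller for the same dataset.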
Similar resources
Automatic Data Clustering Using an Improved Imperialist Competitive Algorithm
Imperialist Competitive Algorithm (ICA) is considered as a prime meta-heuristic algorithm to find the general optimal solution in optimization problems. This paper presents a use of ICA for automatic clustering of huge unlabeled data sets. By using proper structure for each of the chromosomes and the ICA, at run time, the suggested method (ACICA) finds the optimum number of clusters while optim...
Comparison of Agglomerative and Partitional Document Clustering Algorithms
Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters, and in greatly improving the retrieval performance either via cluster-driven dimensionality reduction, term-weighting, or query expansion. This ever-increasing importance of do...
Differential evolution and particle swarm optimisation in partitional clustering
In recent years, many partitional clustering algorithms based on genetic algorithms (GA) have been proposed to tackle the problem of finding the optimal partition of a data set. Surprisingly, very few studies considered alternative stochastic search heuristics other than GAs or simulated annealing. Two promising algorithms for numerical optimization, which are hardly known outside the heuristic...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...
K-attractors: a Partitional Clustering Algorithm for Numeric Data Analysis
Clustering is a data analysis technique, particularly useful when there are many dimensions and little prior information about the data. Partitional clustering algorithms are efficient, but suffer from sensitivity to the initial partition and noise. We propose here k-Attractors, a partitional clustering algorithm tailored to numeric data analysis. As a pre-processing (initialization) step, it e...
Publication date: 2010